Abstract

Background: Unexpected obstetric emergencies threaten the safety of pregnant women. As emergencies are rare,
their management is difficult to learn. Therefore, simulation-based medical education (SBME) seems relevant. In non-systematic
reviews on SBME, medical simulation has been suggested to be associated with improved learner outcomes. However,
many questions on how SBME can be optimized remain unanswered. One unresolved issue is how 'in situ simulation'
(ISS) versus 'off site simulation' (OSS) impact learning. ISS means simulation-based training in the actual patient care
unit (in other words, the labor room and operating room). OSS means training in facilities away from the actual patient
care unit, either at a simulation centre or in hospital rooms that have been set up for this purpose.

Methods and design: The objective of this randomized trial is to study the effect of ISS versus OSS on individual 
learning outcome, safety attitude, motivation, stress, and team performance amongst multi-professional 
obstetric-anesthesia teams. 

The trial is a single-centre randomized superiority trial including 100 participants. The inclusion criteria were health-care 
professionals employed at the department of obstetrics or anesthesia at Rigshospitalet, Copenhagen, who were working 
on shifts and gave written informed consent. Exclusion criteria were managers with staff responsibilities, and staff who 
were actively taking part in preparation of the trial. The same obstetric multi-professional training was conducted in the 
two simulation settings. The experimental group was exposed to training in the ISS setting, and the control group in the 
OSS setting. The primary outcome is the individual score on a knowledge test. Exploratory outcomes are individual 
scores on a safety attitudes questionnaire, a stress inventory, salivary cortisol levels, an intrinsic motivation inventory,
results from a questionnaire evaluating perceptions of the simulation and suggested changes needed in the 
organization, a team-based score on video-assessed team performance and on selected clinical performance. 

Discussion: The perspective is to provide new knowledge on contextual effects of different simulation settings. 

Trial registration: ClinicalTrials.gov NCT01792674.
Background

Care for pregnant and parturient women is a field where 
unexpected emergencies occur; for example, emergency 
Caesarean section, postpartum bleeding or severe pre- 
eclampsia, that may potentially harm both mother and 
baby [1-4]. Since obstetric emergencies are rare and hence 
by nature difficult to learn in real life, simulation-based 
medical education (SBME) is argued to be an essential 
remedy [5]. SBME is defined as "a person, device, or set of 
conditions which attempts to present education and evalu- 
ation problems authentically. The student or trainee is re- 
quired to respond to the problems as he or she would 
under natural circumstances" [6].

Labor wards have a dual function in creating a relaxed 
atmosphere for normal childbirth and at the same time 
showing readiness to deal with life-threatening emergen- 
cies [7]. Labor wards are challenging work places and 
patient safety and medical litigation are high on the 
agenda [8-11]. In certain situations, clinical management 
of pregnant and parturient women may require the in- 
volvement of a variety of health-care professionals and 
medical specialties. The primary care team in a delivery 
room consists of a midwife assisted by an auxiliary 
nurse. In cases of emergencies, more experienced mid- 
wives and obstetricians will be called for assistance. If 
the clinical situation progresses further to an emer- 
gency, an anesthesiologist, a nurse anesthetist and the 
operating room personnel may become involved. Occa- 
sionally, involvement of other specialties may also be 
required, when a rather common clinical event has 
evolved into a potentially life-threatening situation call- 
ing for multi-professional and multi-disciplinary clinical 
management. 

Such rare and complex clinical situations require com- 
plex skills, which cannot be trained and learned in clinical 
practice. Thus, there is a need for SBME in obstetric emer- 
gencies. In a systematic review of training in acute obstetric 
emergencies [12] the authors applied the quality assess- 
ment of diagnostic accuracy studies criteria. Out of 97 arti- 
cles, only eight articles - four randomized trials and four 
cohort studies - assessing the effect of teamwork training 
in a simulation setting were identified. Based on these 
trials, it was concluded that teamwork training in a 
simulation-based setting resulted in improvements in 
knowledge, practical skills, communication, and team per- 
formance in acute obstetric situations. No difference in 
outcomes was found when comparing SBME in a dedi- 
cated simulation centre with SBME in a local hospital set- 
ting [13,14]. 

From the non-systematic reviews on SBME [6,15] and 
the obstetric systematic review [12] we can conclude that 
SBME in labor wards is worthwhile, and that multi-professional
and multi-disciplinary team training are important
approaches due to the complexities of the trained
skills and the rarity of the high-risk obstetric emergencies.
However, we need to further study key elements of SBME 
in order to fully understand how we can best improve 
SBME in obstetric emergencies. One potential element 
influencing the effect of simulation might be the level of 
authenticity of the simulation or, in other words, the fi- 
delity of the simulation. Fidelity is described as a multi- 
dimensional concept consisting of different parts: 1) 
physical/functional or engineering fidelity, which means
the degree to which the simulator duplicates the appear- 
ance and perception of the real system; 2) psychological 
fidelity is the degree to which the trainee perceives the 
simulation to be an authentic surrogate for the trained 
task. The literature states that psychological fidelity is 
considered to be the most essential requirement when 
conducting team training [16,17]. The simulation set- 
ting has traditionally been 'off site simulation' (OSS), ei- 
ther at a simulation centre or in local facilities in the 
hospital set up for the single purpose of simulation 
training. However, more recently, a new simulation mo- 
dality, the 'in situ simulation' (ISS), has been introduced. 
ISS is described by Riley and colleagues [18] as "a team- 
based simulation strategy that occurs on the actual patient 
care units involving actual healthcare team members 
within their own working environment". An unanswered
question is whether ISS is superior to OSS
with regard to simulation-based learning in obstetric
emergencies. We hypothesized that the psychological fidelity
is influenced by the setting in which the simulation
training is conducted, and that ISS can add to the level of 
fidelity and therefore be more effective. 

Apart from a few larger observational studies within dif- 
ferent medical specialties [18-20], most of the studies 
conducted on ISS describe a local educational intervention 
with a local ISS program. Methodologically, the studies 
are descriptive and few include a control group or pre- 
and post-tests, and we have not been able to identify any 
randomized trials [21]. It is argued that ISS can identify 
system weaknesses because ISS takes place in the real 
working environment and, therefore, potentially has more 
psychological fidelity as opposed to OSS [18-22], and ISS 
can be used to test how new processes are functioning in 
clinical facilities [23]. Some argue that ISS overcomes 
feasibility issues and is cost saving compared to OSS in 
simulation centers [24,25]. ISS can consist of either an 
announced training event or an unannounced event. An- 
derson and colleagues [26] focused on unannounced ISS 
and its potential disadvantages, and argued how un- 
announced ISS is time consuming and may intimidate 
participants. 

Human factors such as stress and motivation impact 
learning [27-31]. Studies show that simulation can be a 
stressor. High stress responsiveness has been associated 
with both enhanced and impaired performance, but with 
enhanced learning [29]. As such, further exploration of
these issues is needed. Experimental studies have used
unspecific measurements of stress level [32], and different
stress inventories as well as measurements of salivary
cortisol levels [28,33-36]. Motivational processes are
central to learning [27,30,37], and as part of this trial we 
will investigate phenomena such as intrinsic motivation, 
and how this is moderated by the two different training 
settings (ISS versus OSS). We hypothesize that, in 
simulation-based training in obstetric emergencies, ISS 
is more effective than OSS regarding learning. We 
anticipate that the participants will experience ISS as 
more demanding, and that ISS will create higher levels 
of stress and motivation, which may in turn enhance
learning. Further, we hypothesize that ISS training may 
provide the investigators with more information on 
changes needed in the organization than OSS training 
will. Randomized trials are needed to obtain knowledge 
on the effect of ISS versus OSS on participants and its 
advantages and disadvantages. 

Methods and design 

The design is a single-center, investigator-initiated ran- 
domized superiority trial. 

The setting 

The trial was undertaken at Rigshospitalet, Copenhagen 
University Hospital, in an obstetric and anesthesia high- 
risk department with approximately 6,600 deliveries per 
year. The intervention period was scheduled to run 
through April to June 2013, and follow-up by question- 
naires until August 2013. 

Participants 

All health-care professionals from the department of 
obstetrics and anesthesia, Juliane Marie Centre for 
Children, Women and Reproduction, Rigshospitalet, 
working on or in relation to the labor ward, were eligible 
for inclusion in the trial. These health-care professional 
groups, who were working on shift, encompassed: spe- 
cialized obstetricians; trainee obstetricians; midwives; 
specialized midwives; auxiliary nurses; specialized anes- 
thesiologists; trainee anesthesiologists; nurse anesthe- 
tists; and surgical nurses. Participants gave informed 
consent. Exclusion criteria were lack of informed consent,
employees with managerial and staff responsibilities,
staff members involved in the design or conduct
of the trial, and finally employees who did not work
in shifts.

Randomization 

Randomization was done by the Copenhagen Trial Unit, 
using a computer-generated allocation sequence concealed
from the investigators. The randomization was conducted in
two steps. The participants were individually randomized
into the experimental group (ISS) or the control group 
(OSS). The allocation sequence was stratified according to 
health-care professional groups in order to resemble au- 
thentic teams and according to the days they were avail- 
able for training. After individual randomization, the 
participants in either group (ISS and OSS) were random- 
ized into five teams each. 
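
The two-step procedure can be illustrated with a small sketch. It is not the allocation program used by the Copenhagen Trial Unit: the roster, the number of staff per profession, the seed and the function names are hypothetical, and stratification by available training days is left out for brevity.

```python
import random
from collections import defaultdict


def randomize_individuals(roster, rng):
    """Step 1: allocate individuals to ISS or OSS, stratified by profession."""
    strata = defaultdict(list)
    for person, profession in roster:
        strata[profession].append(person)
    allocation = {}
    for members in strata.values():
        rng.shuffle(members)
        # Alternate arms down the shuffled list so each stratum splits evenly.
        for i, person in enumerate(members):
            allocation[person] = "ISS" if i % 2 == 0 else "OSS"
    return allocation


def randomize_teams(roster, allocation, arm, n_teams, rng):
    """Step 2: deal the members of one arm into teams, profession by profession."""
    by_profession = defaultdict(list)
    for person, profession in roster:
        if allocation[person] == arm:
            by_profession[profession].append(person)
    teams = [[] for _ in range(n_teams)]
    slot = 0
    for members in by_profession.values():
        rng.shuffle(members)
        for person in members:
            teams[slot % n_teams].append(person)
            slot += 1
    return teams


# Hypothetical roster: 100 staff members, 20 in each of five professions.
professions = ["obstetrician", "midwife", "auxiliary nurse",
               "anesthesiologist", "nurse anesthetist"]
roster = [(f"{p}-{i}", p) for p in professions for i in range(20)]

rng = random.Random(2013)  # fixed seed (an assumption) for reproducibility
allocation = randomize_individuals(roster, rng)
iss_teams = randomize_teams(roster, allocation, "ISS", 5, rng)
oss_teams = randomize_teams(roster, allocation, "OSS", 5, rng)
```

With this hypothetical roster, each arm receives 50 participants and each of the ten teams contains two members of every profession.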

Trial interventions 

This trial included an experimental educational inter- 
vention ISS [18,21], which means training in the actual 
patient care unit (in other words, the labor room and 
operating theatre). The experimental intervention in the 
present trial was pre-announced ISS. We chose announced
ISS because conducting unannounced ISS sessions involving
health-care professionals on a larger scale is unrealistic,
given work schedules and daily clinical
work activities. Training of the
control group (OSS) took place in training rooms that 
were set up for the occasion in the hospital, but away 
from the actual patient care unit. 

The simulated scenarios applied in the trial were 
contained in a full training day. The development of the 
curriculum for the training day was based on an instruc- 
tional design approach [38,39] and was developed and pilot 
tested by a local multi-professional working committee. In 
January 2012 this working committee was appointed by 
the managerial groups of the departments of anesthesia 
and obstetrics and consisted of representatives from all the 
health-care professionals who would participate in the trial.
This working committee developed aims and objectives
based on the principles of Bloom's taxonomy [40], and the
aims and objectives were approved by the management 
groups. The simulated scenarios in ISS and OSS were 
designed in a way that involved both the labor room set- 
ting and the operating theatre, to specifically focus on the 
patient journey and the communication amongst health- 
care professionals during patient transfers, where many 
different health-care professionals and different medical 
disciplines are involved. This approach to training was 
chosen based upon previous experience with obstetric 
simulation-based training conducted in the obstetric de- 
partment [32,41] and was designed in accordance with the 
overall plan of strategy of the obstetric department and the 
anesthesia department, Juliane Marie Centre for Children, 
Women and Reproduction, Rigshospitalet [42].

In the labor room, a simulated patient acted as the pa- 
tient. In the operating room, a full body interactive 
birthing simulator, a SimMom, was the patient [43]. The 
SimMom simulator offers the functionality required for 
training in a wide range of midwifery, obstetric and 
anesthesia skills, and the anatomy and functionality of 
the SimMom allows for multi-professional training of
labor and delivery management. Standardized clinical sim- 
ulated scenarios were designed and, combined with pre- 
programmed scenarios on the SimMom, this allowed for 
standardized training. The educators were recruited from
the local working committee, and all educators were
trained to run and facilitate the scenarios in a
standardized way and to debrief the
participants.

In addition, the training day also included some video- 
based, case-based and lecture-based teaching sessions. 
Also, on the simulation days, data related to the trial were 
collected in the form of written multiple choice questions 
(MCQ), questionnaires on subjective stress and salivary 
cortisol samples (Table 1). Training days were scheduled
into the individual employee's working plan. Figure 1 gives 
an overview of the randomization and intervention pro- 
cedure as well as outcomes. 

Blinding 

The participants and the educators providing the educa- 
tional intervention, and the assessors observing and 
assessing videos, were not blinded to the intervention. 
The data managers, statisticians and investigators draw- 
ing conclusions will be blinded to the allocated interven- 
tion groups. 

Measurements and assessment of outcomes 

Table 2 provides an overview of the variables, outcomes 
and accompanying statistical analyses. Table 2 is inspired 
by the SPIRIT 2013: Explanation and elaboration: guid- 
ance for protocols of clinical trials [44]. See Table 1 for 
the time schedule for obtaining measurements. 

Primary outcome 

The primary outcome is the knowledge test result from
the MCQs. The MCQ test was administered at the end
of the training day, and the mean scores of the
experimental and control groups will be compared.

Previous research on knowledge testing has found that 
written tests are able to predict results in performance- 
based testing [45,46]. The argument for applying the 
MCQ is that it is feasible to test many participants in a 
relatively short time and at low costs [45]. Previously 
used MCQ tests and 'knowledge of skills test' [13,32] 
were used for inspiration, when constructing this new 
MCQ. 

The MCQs were created as a 'one-best-answer' item 
format with three to five options, which requires the 
participants to select the single best response [47,48]. 
The content of the MCQs was based on aims and objectives
developed by the multi-professional working
group appointed by the management, and the MCQs were
tested among all health-care professionals in this local
working group. The content validity was further tested
among specialized obstetricians and specialized obstetric 
anesthesiologists. Subsequently, the MCQs were tested 
among midwifery students, medical students, trainee 
doctors and specialized obstetricians and specialized ob- 
stetric anesthesiologists from other hospitals and were 
found to be construct valid. During the statistical ana- 
lysis, some items in the MCQ needed to be deleted. The 
description of development and results of the MCQs 
used in this trial will be reported in another publication. 

Exploratory outcomes 

The "Safety Attitudes Questionnaire" (SAQ) consists of 
33 items on a five-point scale that is divided into six 
dimensions. SAQ was applied approximately 1 month 
prior to and approximately 1 month after the training 
day. The mean values on six different dimensions of the 
SAQ results from the experimental and control group 
will be compared 1 month after the training day. SAQ is 
an inventory used in several countries and also applied 
and validated in a Scandinavian context, and previously 
tested in Denmark [49-51]. 

Salivary cortisol (reflecting hypothalamic-pituitary-adrenal
axis activity) was used as a biological marker of stress
levels. The Sarstedt Cortisol Salivette device, provided by
Neogen Corporation (944 Nandino Blvd, Lexington, KY 40511-1205,
USA; product no. 402710), was used. The analysis will
be a duplicate analysis based on the ELISA technique (405
nm), where a 100 µl sample will be extracted and 50 µl in
duplicate will be used in the ELISA kit. Eight standards
ranging from 0.04 ng/ml up to 10 ng/ml will be used in the
assay together with a blank control. The microtiter plate
will be read at dual wavelengths of 450 nm and 650
nm. In the calculation of the data, the blank background will be
subtracted from all absorbance values before a non-linear
fit to the standard curve is calculated.

The salivary cortisol samples were obtained before the
simulation (baseline) and three times in relation to the
simulations. The cortisol response will be measured as the
increase in salivary cortisol from individual baseline to
peak, and mean response values in the experimental and
control groups will be compared.
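
The baseline-to-peak response can be sketched as follows. The function name and the sample values are hypothetical and serve only to illustrate the definition given above.

```python
def cortisol_response(baseline, follow_ups):
    """Increase from the individual baseline to the peak follow-up value."""
    if not follow_ups:
        raise ValueError("at least one follow-up sample is required")
    return max(follow_ups) - baseline


# Hypothetical salivary cortisol values (nmol/l) for one participant:
# one baseline sample and three samples taken in relation to the simulation.
response = cortisol_response(baseline=5.0, follow_ups=[6.5, 9.0, 7.0])
print(response)  # 4.0
```

The per-participant responses would then be averaged within each group before the comparison.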

State-Trait Anxiety Inventory (STAI-1) was adminis- 
tered before the simulations started (baseline) and twice 
following the simulations [52,53]. It will reflect the sub- 
jective stress response. The peak level of subjective stress 
response will be used and mean values in the experi- 
mental and control group will be compared. 

Table 1 Time schedule of measurements

Measurement | Time schedule
Individual measurements
Multiple choice questionnaire | Training day, start of the day; training day, end of the day
Safety attitudes questionnaire | 4 weeks before training day; 4 weeks after training day
State-trait anxiety inventory | Training day, before first simulation and twice after; training day, before second simulation and twice after
Cognitive appraisal | Training day, before first simulation and twice after; training day, before second simulation and twice after
Test for salivary cortisol | Training day, before first simulation and three times after; training day, before second simulation and three times after
Intrinsic motivation inventory | 1 week after training day
Evaluation questionnaire | 1 week after training day
Team measurements
Team emergency assessment measure | Training day, first simulation: video recordings, video assessment by independent assessors; training day, second simulation: video recordings, video assessment by independent assessors
Selected clinical measures | Training day, first simulation: video recordings, video assessment by independent assessors; training day, second simulation: video recordings, video assessment by independent assessors

Figure 1 Randomized trial of 'in situ simulation' (ISS) versus 'off site simulation' (OSS): randomization, intervention and
outcome measurements. [Flow chart: information to all eligible staff members (obstetric and anaesthesia doctors, anaesthesia
and scrub nurses, auxiliary nurses and midwives); written informed consent and randomization of 100 staff members into
authentic multi-professional teams of ten; Safety Attitudes Questionnaire one month before the training day; in each arm
(ISS in labour suites and operating theatre, N = 50; OSS in training rooms, N = 50) a baseline written knowledge test,
stress inventory, cognitive appraisal and salivary cortisol measurements, and two video-recorded scenarios in the same
teams, each followed by debriefing; post-training written knowledge test at the end of the training day; questionnaire
about perceptions of simulations, debriefing and ideas about organisational changes, and Intrinsic Motivation Inventory,
one week after the training day; Safety Attitudes Questionnaire one month after the training day.]

Cognitive appraisal [31,36,54] was assessed before and
after each scenario, using the method described by
Tomaka [54]; in other words, primary appraisal was examined
by asking the participants to answer the question
"how stressful do you expect the upcoming task to
be?" Secondary appraisal was measured by asking the
participants "how able were you to cope with this task?"
The participants indicated their answers on an anchored
ten-point Likert scale. An index of cognitive appraisal
will be calculated as the ratio of the primary appraisal
(task) to the secondary appraisal (resource). If the resources
are assessed as being greater than the task demands,
the situation is appraised as a 'challenge'. If the
task demands are appraised as being greater than the
resources, the situation is appraised as a 'threat' [54].
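
The appraisal index and its interpretation can be sketched as below. The function names are hypothetical, and the 'balanced' label for equal ratings is an assumption, since the text does not say how ties are classified.

```python
def appraisal_index(primary, secondary):
    """Ratio of primary appraisal (task demands) to secondary appraisal (resources)."""
    for rating in (primary, secondary):
        if not 1 <= rating <= 10:
            raise ValueError("ratings are expected on a ten-point scale (1 to 10)")
    return primary / secondary


def classify_appraisal(primary, secondary):
    """Label the situation from the two ratings."""
    if secondary > primary:
        return "challenge"  # resources exceed task demands (index below 1)
    if primary > secondary:
        return "threat"     # task demands exceed resources (index above 1)
    return "balanced"       # assumption: equal ratings are not covered by the text


print(appraisal_index(8, 4), classify_appraisal(8, 4))  # 2.0 threat
print(appraisal_index(3, 6), classify_appraisal(3, 6))  # 0.5 challenge
```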

"Intrinsic Motivation Inventory" consists of 22 items on 
a seven-point scale that is divided into four dimensions. It 
was administered as a questionnaire approximately 1 week 
after the training day [27]. The median values in the ex- 
perimental and control group will be compared. 

A questionnaire was administered to evaluate participant
perceptions of the simulations and the debriefing
approximately 1 week after the training day. This questionnaire
included questions on a Likert scale about personal
perceptions of the scenario (that is, learning,
realism, cooperation between health-care professionals,
own role in the team, et cetera) and whether the simulation
training scenarios inspired the participants to suggest
organizational change proposals (that is, changes in guidelines,
practical things, et cetera). The data will be treated
as ordinal data at the item level. The median values in the
experimental and control groups will be compared.

The team performance score will be assessed by independent
observers reviewing video recordings of the
scenarios. A validated rating scale, the "Team Emergency
Assessment Measure" developed by Cooper and colleagues
[55,56], will be used. The median performance scores
in the experimental and control groups will be
compared.

Clinical performance in the simulated setting will be
assessed by the independent assessors through review
of the video recordings of the scenarios. The assessment
score is based on data such as the number of minutes
from the start of the scenario until the decision to operate
was made, the number of minutes from that decision until
the operation was initiated, and whether medications such as
uterotonics were administered. The mean performance score
in the experimental and control
groups will be compared.

Table 2 Variables, research hypothesis, outcome measures and methods of statistical analysis

Variable/outcome | Research hypothesis: experimental group versus control group | Outcome measure | Type of variable | Methods of statistical analysis
Variables/outcomes on individual level (N = 100)
Primary outcome
Multiple choice questions | Improvement occurs in the experimental group | Percentage correct in 40 multiple choice questions | Will be analyzed as interval data; a Gaussian distribution is expected | Parametric techniques; ANOVA
Exploratory outcomes
Safety Attitudes Questionnaire | Increased score in the experimental group | 33 items on a 5-point scale, divided into 5 dimensions; data are converted to the 100-point scale | Will be analyzed as interval data; a Gaussian distribution is expected | Parametric techniques; ANOVA; chi-square tests
State-Trait Anxiety Inventory (baseline, 1, 2) | Increased peak score in the experimental group | Inventory of 20 items (interval 20 to 80) | Will be analyzed as interval data; a Gaussian distribution is expected | Parametric techniques; ANOVA
Cognitive appraisal (baseline, 1, 2) | Increased peak score in the experimental group | Likert scale 1 (10 point)/Likert scale 2 (10 point) (interval 1/10 to 10) | Ordinal data | Non-parametric techniques; Mann-Whitney U test
Test for salivary cortisol (baseline, 1, 2, 3) | Increased salivary cortisol level from baseline to peak in the experimental group | Cortisol level in nmol/l | Interval data | Parametric techniques; ANOVA
Evaluation questionnaire | Increased positive evaluation in the experimental group | 20 questions on a 5-point Likert scale | Ordinal data | Non-parametric techniques; treated as ordinal data at the item level
Intrinsic Motivation Inventory | Increased score in the experimental group | Task evaluation inventory, 22 items on a 7-point scale, divided into 4 dimensions | Ordinal data | Non-parametric techniques; Mann-Whitney U test
Variables on team level (N = 10 teams)
Team Emergency Assessment Measure | Improved outcome in the experimental group | Video assessment on a 5-point scale (0 to 5) of 11 questions (0 to 44); 10-point scale for global rating of the team | Ordinal data | Non-parametric techniques; Mann-Whitney U test
Selected clinical measures | Improved outcome in the experimental group | Minutes before decision making about operation, minutes before operation initiated; medication given yes/no | Interval data | Parametric techniques

The table is inspired by SPIRIT 2013: Explanation and elaboration: guidance for protocols of clinical trials [44]. ANOVA, analysis of variance.

Sample size calculation

There are no data on training effectiveness of ISS upon
which to base sample size calculations. We chose to calculate
the required sample size based on experience with
knowledge tests from data in previous studies [13,32]. We
planned a trial of a continuous response variable
from independent control and experimental participants
with one control per experimental participant. We assumed
the response within the experimental and the control
group to be normally distributed with a standard
deviation of 24%. If the true difference in the experimental
and control means was 17%, we needed to study 32 experimental
participants and 32 control participants (a total
of 64) to be able to reject the null hypothesis, that is, that
there was no difference in population means of the
experimental and control groups, with a probability
(power) of 80%. The two-sided type I error probability
associated with the test of this null hypothesis was 5%.
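
As a rough check of these figures, the sketch below uses the standard normal-approximation formula for comparing two means. This is an assumption about the method: the text does not state which formula or software was used, and the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(sd, delta, alpha=0.05, power=0.80):
    """Participants per group for a two-sided comparison of two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 5%
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for power = 80%
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


n = n_per_group(sd=24, delta=17)
print(n, 2 * n)  # 32 64
```

Under these assumptions the approximation reproduces the 32 participants per group (64 in total) stated above.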

Sample size estimation adjusted for clustering 

As the intervention was delivered in teams (clusters), ob- 
servations on participants in the same team were likely 
to be correlated. Hence the effective sample size was less 
than that suggested by the actual number of individual par- 
ticipants. The reduction in effective sample size depends 
on the intra-class or cluster correlation coefficient (ICC) 
[57,58]. In order to adjust the sample size for this, the 
crude sample size calculated above needed to be multiplied
by the design effect. The cluster size was ten, as there were
ten participants in each team, and we assumed the ICC to
be 0.05 [58]. Design effect = 1 + (cluster size - 1) × ICC
= 1 + (10 - 1) × 0.05 = 1.45. Accordingly, the adjusted sample
size was 64 × 1.45 = 92.8 participants. We therefore planned to
include 100 participants, 50 in the experimental group and 50
in the control group, with five teams of 10 participants in
each arm. The statistical methods for the primary and
exploratory outcomes are laid out in Table 2.
The intervention was delivered in teams, which
means that participants were clustered within teams. Since 
observations from individuals in the same team are poten- 
tially correlated we will use generalized estimating equa- 
tions (GEE) [59] in the parametric analyses to take this 
cluster effect into account. The statistical analysis will be 
adjusted for health-care professional groups. The experi- 
mental group (participants or teams in ISS) will be com- 
pared against the control group (participants or teams in 
OSS) for all analyses. The results will be expressed by 
means with standard deviations and confidence intervals, 
as well as by medians with percentiles. Associated P-values 
and effect sizes will be reported. 
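
The clustering adjustment of the sample size described above can be checked with a few lines of code. The function name is hypothetical; the inputs are the cluster size, ICC and crude sample size stated in the text.

```python
from math import ceil


def design_effect(cluster_size, icc):
    """Design effect = 1 + (cluster size - 1) x ICC."""
    return 1 + (cluster_size - 1) * icc


deff = design_effect(cluster_size=10, icc=0.05)
adjusted_n = 64 * deff  # crude sample size multiplied by the design effect
print(round(deff, 2), round(adjusted_n, 1), ceil(adjusted_n))  # 1.45 92.8 93
```

At least 93 participants are therefore required, which the planned 100 participants cover.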

For the interval scale data, linear regression will be 
used to analyze changes between the experimental and 
the control group from baseline to peak. GEE will be 
used to take the clustered nature of the data into ac- 
count. Non-parametric statistical analyses will be used 
for the ordinal scale data. Medians and percentiles will 
be reported and the Mann- Whitney U test will be used. 
Individual responses to the evaluation questionnaire are 
measured on a Likert scale and will be treated as ordinal 
data and analyzed at the item level. 
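
For the ordinal data, the statistic behind the Mann-Whitney U test can be sketched as a count of pairwise comparisons. The Likert responses below are invented for illustration. The sketch computes only the U statistic, not a P-value, and does not account for clustering within teams.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y): pairwise wins for each sample, ties counted as one half."""
    u_x = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Hypothetical Likert responses to one questionnaire item.
iss = [4, 5, 3, 5, 4]
oss = [3, 4, 2, 3, 4]
u_iss, u_oss = mann_whitney_u(iss, oss)
print(u_iss, u_oss)  # 20.0 5.0
```

The smaller of the two values is the statistic usually referred to the reference distribution.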

To take missing data into account, all analyses will be 
performed as intention-to-treat analyses. Missing data 
will be handled by multiple imputation techniques. 
For all tests, we will use two-sided P-values, with P <
0.05 as the level of significance. We will use the
Benjamini-Hochberg method to adjust for multiple
testing [60].
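
The Benjamini-Hochberg adjustment can be sketched as follows. This is a minimal illustration of the standard step-up procedure with invented P-values; it is not the authors' analysis code.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P-values from four exploratory outcomes.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
```

In this example only the first outcome remains below 0.05 after adjustment.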



Ethical consideration 

Participants are health-care professionals and neither pa- 
tients nor patient data are used in the trial. The trial com- 
plies with the current version of the Declaration of Helsinki 
on biomedical research and with the Act on Processing of 
Personal Data. Relevant approvals from the Regional Ethics
Committee (protocol number H-2-2012-155) and the
Danish Data Protection Agency (number 2007-58-0015)
have been obtained. The trial is registered at www.clinicaltrials.gov
with number NCT01792674.

The training program was planned to take place dur- 
ing normal working hours and participants were paid 
full salary for their attendance. No further compensation 
was given to participants. Participation was voluntary 
and the participants could withdraw from the trial at 
any time. 

Participants were assured that their personal data, data 
on questionnaires, salivary cortisol samples and video 
recordings will remain anonymous during analyses and 
reporting. The participants were asked to respect the confi- 
dentiality of their observations about colleagues' perform- 
ance in the simulated setting. 

Recruitment of participants 

The eligible participants were informed at conferences, 
meetings, on a web page [61], by written notice on no- 
tice boards, and by a personal letter administered by the 
hospital local post distribution, which gave the partici- 
pants the opportunity to make an informed decision 
about their participation in the trial. The eligible partici- 
pants could obtain more written information from our 
web page [61] and by contacting the principal investiga- 
tor or another contact person directly. After receiving 
written and verbal information, eligible participants were 
asked to sign a consent form before being enrolled in 
the trial. 

Discussion 

This is the first randomized trial investigating the effect of 
ISS versus OSS for SBME. An advantage of the trial is that 
it includes authentic teams of health-care professionals 
also involved in these clinical scenarios in real life. Several 
simulation-based studies are not performed on authentic 
teams and students have often been enrolled as they are 
more flexible and easier to include in trials. However, ap- 
plicability of these data is questionable as results based 
upon undergraduate students may not necessarily apply 
to postgraduate employed health-care workers. Includ- 
ing authentic teams will probably be advantageous when 
interpreting the results and drawing conclusions. 

However, the fact that authentic obstetric-anesthesia 
teams are trial participants - that is, fully employed health- 
care professionals - may carry feasibility problems. There 
is a risk that real emergencies combined with a lack of 
staff will force some of the randomized health-care 
professionals to discontinue their trial participation. 
Further, there is a minor risk 
that a full team randomized to ISS needs to discontinue if 
a real life emergency situation necessitates the use of the 
rooms in the labor ward and operating theatre that were 
allocated to the trial for the day. Through our careful plan- 
ning and cooperation with the managerial teams of the in- 
volved departments, this risk will be minimized. 

A potential weakness is that the trial is a single-site 
trial including only a moderate number of participants. 
There will also be a risk of contamination amongst 
teams, as the health-care professionals in the experimental 
group (ISS) intermingle with staff members allocated to the 
control group (OSS) and may share information. This may 
affect the generalizability of this study. Moreover, this trial 
only assesses surrogate outcomes for the relevant clinical 
outcome, that is, whether neonatal and maternal health 
fares better with ISS compared with OSS. However, being 
the first randomized trial comparing ISS with OSS, the 
trial has the potential to add some new insight with 
regards to the effect of authenticity in the setting for 
SBME and to inform future research in this field. 

The sample size estimation has been based on data 
from other knowledge tests [13,32], as there are no 
current data on knowledge testing or on the training 
effectiveness of ISS versus OSS. The sample size 
calculation is adjusted for clustering. However, we have 
no prior information about the ICC, and therefore this 
estimation is based on general recommendations [57,58]. 
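
As an illustration of this kind of adjustment, a commonly used approach for equally sized clusters inflates the sample size by the design effect 1 + (m - 1) x ICC, where m is the cluster (team) size. The sketch below assumes this formula; the function names, the unadjusted sample size, the team size and the ICC are hypothetical and are not the values used in this trial.

```python
import math


def design_effect(cluster_size, icc):
    """Design effect for equally sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1) * icc


def inflate_sample_size(n_unadjusted, cluster_size, icc):
    """Round the cluster-adjusted sample size up to a whole participant."""
    return math.ceil(n_unadjusted * design_effect(cluster_size, icc))


# Hypothetical inputs (illustration only): 64 participants required under
# individual randomization, teams of 10, and an assumed ICC of 0.05.
deff = design_effect(10, 0.05)
n_adjusted = inflate_sample_size(64, 10, 0.05)
print(deff, n_adjusted)
```

Because the required sample size grows linearly with the ICC, an ICC that turns out larger than assumed would leave the trial with less power than planned.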

The primary outcome is a knowledge test. As alluded 
to above, it would have been preferable to have neonatal 
and maternal health as clinical outcomes. This is not 
possible in the present trial, as a very high number of 
deliveries would be required to directly measure 
patient-relevant outcomes in obstetrics [62]. However, 
there are educational studies indicating that performance 
in a written knowledge test can relate to clinical 
performance in practice [63]. 

Given the nature of the trial, it will not be possible to 
blind the participants, the educators providing the edu- 
cational intervention, or the assessors observing and 
assessing videos. This carries a risk of overestimating 
the beneficial effects of the experimental intervention 
[64,65]. However, the data managers, statisticians and 
investigators drawing conclusions will be blinded to the 
allocated intervention group, and we will consider the 
risks of bias when drawing conclusions. 

This trial can bring new information on SBME. The 
simulation setting has traditionally been OSS; however, 
an unanswered question is which advantages, if any, ISS 
can add to learning. Randomized trials are needed to ob- 
tain knowledge of advantages and disadvantages of ISS 
versus OSS. The study can potentially also inform the 
theory of fidelity of simulation [16]. The results of this 
trial may also add knowledge to inform the political 
planning and decision making process during rebuilding 
and building of hospitals and simulation centers. It is 
important to know whether high-fidelity simulation cen- 
ters should be prioritized as opposed to designing/build- 
ing simulation rooms 'in situ' for future simulation- 
based education. 

Trial status 

Planning of the trial was initiated in January 2012. Enrol- 
ment of participants was initiated in January 2013. The 
intervention is scheduled to start in April 2013 and will 
stop in June 2013. Follow-up by questionnaires will con- 
tinue until August 2013. 